second-order critical point
- North America > United States > California > Alameda County > Berkeley (0.14)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- North America > United States > Illinois > Cook County > Evanston (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > United States > California > Alameda County > Berkeley (0.14)
- Asia > Middle East > Jordan (0.04)
31784d9fc1fa0d25d04eae50ac9bf787-Paper.pdf
Indeedin learning applications, where symmetric tensors areformed from statistical moments (higher-order covariances) or multivariate derivatives (higher-order Hessians), CP decomposition has enabled parameter estimation for mixtures of Gaussians [20, 35], generalized linear models [34], shallow neuralnetworks[19,24,42],deepernetworks[17,18,30],hiddenMarkovmodels[5],amongothers.
The non-convex Burer-Monteiro approach works on smooth semidefinite programs Nicolas Boumal
Semidefinite programs (SDPs) can be solved in polynomial time by interior point methods, but scalability can be an issue. To address this shortcoming, over a decade ago, Burer and Monteiro proposed to solve SDPs with few equality constraints via rank-restricted, non-convex surrogates. Remarkably, for some applications, local optimization methods seem to converge to global optima of these non-convex surrogates reliably. Although some theory supports this empirical success, a complete explanation of it remains an open question. In this paper, we consider a class of SDPs which includes applications such as max-cut, community detection in the stochastic block model, robust PCA, phase retrieval and synchronization of rotations. We show that the low-rank Burer-Monteiro formulation of SDPs in that class almost never has any spurious local optima. This paper was corrected on April 9, 2018. Theorems 2 and 4 had the assumption that M (1) is a manifold.
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- North America > United States > California > Alameda County > Berkeley (0.14)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Africa > Senegal > Kolda Region > Kolda (0.05)
- North America > Mexico > Oaxaca (0.04)
Scalable Second-order Riemannian Optimization for $K$-means Clustering
Xu, Peng, Hou, Chun-Ying, Chen, Xiaohui, Zhang, Richard Y.
Clustering is a hard discrete optimization problem. Nonconvex approaches such as low-rank semidefinite programming (SDP) have recently demonstrated promising statistical and local algorithmic guarantees for cluster recovery. Due to the combinatorial structure of the $K$-means clustering problem, current relaxation algorithms struggle to balance their constraint feasibility and objective optimality, presenting tremendous challenges in computing the second-order critical points with rigorous guarantees. In this paper, we provide a new formulation of the $K$-means problem as a smooth unconstrained optimization over a submanifold and characterize its Riemannian structures to allow it to be solved using a second-order cubic-regularized Riemannian Newton algorithm. By factorizing the $K$-means manifold into a product manifold, we show how each Newton subproblem can be solved in linear time. Our numerical experiments show that the proposed method converges significantly faster than the state-of-the-art first-order nonnegative low-rank factorization method, while achieving similarly optimal statistical accuracy.
- North America > United States > Illinois (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)